graph
A[Start New RNA-Seq Project] --> B[Set Up Directory Structure]
B --> C[Create R Project]
C --> D[Use renv to Manage Dependencies]
D --> E[Set Up Git Version Control]
E --> F[Connect to GitHub Repository]
F --> G[Develop RNA-Seq Analysis Scripts]
G --> H[Commit & Push Changes Regularly]
1. Maintaining Directory Structure (RNA-Seq Projects)
Recommended RNA-Seq Project Layout:
RNA-Seq_ProjectName/
├── data/
│   ├── raw_data/                  # FASTQ or BAM files (read-only)
│   ├── reference_data/            # Reference genome, GTF, annotations
│   ├── meta_data/                 # Sample information & others (CSV, TSV)
│   └── processed_data/            # Derived data
│       ├── trimmed_data/          # Adapter-trimmed FASTQ files
│       ├── alignments_data/       # Alignment outputs (BAM/SAM)
│       └── counts_data/           # Gene/transcript counts
│
├── results/
│   ├── qc/                        # FastQC, MultiQC reports
│   ├── differential_expression/   # DESeq2, edgeR, limma results
│   ├── functional_profiling/      # GO, KEGG enrichment
│   └── final_figures/             # Publication-ready plots
│
├── reports/                       # Quarto or RMarkdown reports
├── scripts/                       # R or shell scripts
├── R/                             # Custom R functions
├── logs/                          # Pipeline logs
└── README.md

Tip
- Keep data/raw_data/ read-only.
- Separate code, data, and results.
- Use here::here() or fs::path() for reproducible paths.
- Document folder purpose in README.md.
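
The layout and the read-only tip above can be set up with a short shell script. This is only a minimal sketch: the `PROJECT_ROOT` variable and the one-line README stub are assumptions made for illustration, and the folder names are copied from the recommended layout.

```shell
#!/usr/bin/env bash
# Scaffold the recommended RNA-Seq project layout.
# PROJECT_ROOT is a hypothetical variable (not from the original guide);
# it defaults to the placeholder name used in the layout above.
set -euo pipefail

PROJECT_ROOT="${PROJECT_ROOT:-RNA-Seq_ProjectName}"

# Create every folder in the layout; -p makes the script safe to re-run.
mkdir -p \
  "$PROJECT_ROOT"/data/raw_data \
  "$PROJECT_ROOT"/data/reference_data \
  "$PROJECT_ROOT"/data/meta_data \
  "$PROJECT_ROOT"/data/processed_data/trimmed_data \
  "$PROJECT_ROOT"/data/processed_data/alignments_data \
  "$PROJECT_ROOT"/data/processed_data/counts_data \
  "$PROJECT_ROOT"/results/qc \
  "$PROJECT_ROOT"/results/differential_expression \
  "$PROJECT_ROOT"/results/functional_profiling \
  "$PROJECT_ROOT"/results/final_figures \
  "$PROJECT_ROOT"/reports \
  "$PROJECT_ROOT"/scripts \
  "$PROJECT_ROOT"/R \
  "$PROJECT_ROOT"/logs

# Start a README stub only if one does not exist yet (assumed content).
if [ ! -e "$PROJECT_ROOT/README.md" ]; then
  printf '# %s\n' "$(basename "$PROJECT_ROOT")" > "$PROJECT_ROOT/README.md"
fi

# Remove write permission from any files already in raw_data.
# With an empty folder this is a no-op; re-run it after copying in FASTQ/BAM files.
find "$PROJECT_ROOT/data/raw_data" -type f -exec chmod a-w {} +
```

Setting `PROJECT_ROOT` before running the script chooses a different project name. Only the files inside data/raw_data/ are made read-only, so new raw files can still be added to the folder; re-running the last command afterwards protects them as well.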